Clustering Time-Varying Gene Expression Profiles using Scale-space Signals

Author

  • Tanveer Syeda-Mahmood
Abstract

The functional state of an organism is determined largely by the pattern of expression of its genes. The analysis of gene expression data from gene chips has primarily revolved around clustering and classification of the data using machine learning techniques based on the intensity of expression alone, with the time-varying pattern mostly ignored. In this paper, we present a pattern recognition-based approach to capturing similarity by finding salient changes in the time-varying expression patterns of genes. Such changes can give clues about important events, such as gene regulation by cell-cycle phases, or even signal the onset of a disease. Specifically, we observe that dissimilarity between time series is revealed by the sharp twists and bends produced in a higher-dimensional curve formed from the constituent signals. Scale-space analysis is used to detect the sharp twists and turns, and their relative strength with respect to the component signals is estimated to form a shape similarity measure between time profiles. A clustering algorithm is presented to cluster gene profiles using the scale-space distance as a similarity metric. Multi-dimensional curves formed from time series within clusters are used as cluster prototypes or indexes to the gene expression database, and are used to retrieve genes functionally similar to a query gene profile. An extensive comparison of clustering with the scale-space distance against the traditional Euclidean distance is presented on the yeast genome database.
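The idea in the abstract, that two profiles of the same shape trace a straight line when plotted against each other while dissimilar profiles trace a curve with bends, can be illustrated with a small sketch. This is not the paper's algorithm: the function names (`gaussian_smooth`, `bend_strength`, `greedy_clusters`), the choice of scales, the use of |sin| of the turning angle as a bend score, and the greedy grouping rule are all assumptions made for illustration.

```python
import math

def gaussian_smooth(signal, sigma):
    """Smooth a list with a truncated Gaussian kernel; edges are clamped."""
    radius = max(1, int(3 * sigma))
    kernel = [math.exp(-(k * k) / (2.0 * sigma * sigma))
              for k in range(-radius, radius + 1)]
    total = sum(kernel)
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = min(max(i + j - radius, 0), n - 1)  # clamp at the borders
            acc += w * signal[idx]
        out.append(acc / total)
    return out

def zscore(signal):
    """Remove mean and scale so that intensity differences do not matter."""
    mean = sum(signal) / len(signal)
    std = math.sqrt(sum((v - mean) ** 2 for v in signal) / len(signal)) or 1.0
    return [(v - mean) / std for v in signal]

def bend_strength(a, b, sigmas=(1.0, 2.0)):
    """Hypothetical shape distance: sum of |sin(turning angle)| along the
    2-D curve (a(t), b(t)), averaged over a few smoothing scales."""
    a, b = zscore(a), zscore(b)
    total = 0.0
    for sigma in sigmas:
        xs, ys = gaussian_smooth(a, sigma), gaussian_smooth(b, sigma)
        for i in range(1, len(xs) - 1):
            v1 = (xs[i] - xs[i - 1], ys[i] - ys[i - 1])
            v2 = (xs[i + 1] - xs[i], ys[i + 1] - ys[i])
            n1, n2 = math.hypot(*v1), math.hypot(*v2)
            if n1 < 1e-9 or n2 < 1e-9:
                continue  # skip near-stationary steps
            cross = v1[0] * v2[1] - v1[1] * v2[0]
            total += abs(cross) / (n1 * n2)
    return total / len(sigmas)

def euclidean(a, b):
    """Intensity-based distance, for comparison."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def greedy_clusters(profiles, threshold):
    """Assign each profile to the first cluster whose first member is within
    the threshold; otherwise start a new cluster. Returns lists of indices."""
    clusters = []
    for idx, profile in enumerate(profiles):
        for members in clusters:
            if bend_strength(profiles[members[0]], profile) <= threshold:
                members.append(idx)
                break
        else:
            clusters.append([idx])
    return clusters

# Toy profiles (made-up values, not real expression data).
g1 = [0, 1, 4, 9, 4, 1, 0, 1, 4, 9]
g2 = [2 * v + 3 for v in g1]          # same shape, higher intensity
g3 = [4, 9, 4, 1, 0, 1, 4, 9, 4, 1]   # same pattern, shifted in time

d_same = bend_strength(g1, g2)
d_diff = bend_strength(g1, g3)
print("shape distance g1-g2:", d_same, " Euclidean:", euclidean(g1, g2))
print("shape distance g1-g3:", d_diff, " Euclidean:", euclidean(g1, g3))
print("clusters:", greedy_clusters([g1, g2, g3], threshold=0.01))
```

In this toy setting the Euclidean distance separates g1 from g2 because their intensities differ, while the bend score treats them as the same shape. One limitation of the sketch is that perfectly anti-correlated profiles also trace a straight line and therefore score as similar; a real measure would need to handle that case separately.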


Similar resources


Modèles d'intégration de la connaissance pour la fouille des données d'expression des gènes. (Knowledge Integration Models for Mining Gene Expression Data)

In the framework of this thesis we develop new data mining models for knowledge discovery with gene expression profiles. Data mining is the science of automatically extracting knowledge hidden in large data sets. Gene expression technologies are powerful methods for studying biological processes through a transcriptional point of view. These technologies have produced vast amounts of data by mea...


Genome-Scale Gene Expression Profiles Mapped onto the Pathway and Genome Maps in KEGG

The emerging technology of DNA chips and microarrays makes it possible to simultaneously analyze the expression of many genes, such as the whole set of genes in the completely sequenced genome. We have been developing a system for network oriented analysis and visualization of such genome-scale gene expression data. It projects the time-series data of gene expression profiles on the functional r...


Super paramagnetic clustering of yeast gene expression profiles

High density DNA arrays used to monitor gene expression at a genomic scale have produced vast amounts of information which require the development of efficient computational methods to analyze them. The important first step is to extract the fundamental patterns of gene expression inherent in the data. This paper describes the application of a novel clustering algorithm Super Paramagnetic Cluster in...


EXCAVATOR: a computer program for efficiently mining gene expression data

Massive amounts of gene expression data are generated using microarrays for functional studies of genes, and gene expression data clustering is a useful tool for studying the functional relationship among genes in a biological process. We have developed a computer package EXCAVATOR for clustering gene expression profiles based on our new framework for representing gene expression data as a minimu...



Publication date: 2003